Audiovisual perceptual evaluation of resynthesised speech movements
Authors
Abstract
We have previously presented a system that tracks the 3D speech movements of a speaker's face in a monocular video sequence. For that purpose, speaker-specific models of the face were built, including a 3D shape model and several appearance models. In this paper, speech movements estimated with this system are evaluated perceptually. The movements are re-synthesised using a Point-Light (PL) rendering and paired with original audio signals degraded with white noise at several SNRs. We study how much such PL movements enhance the identification of logatoms, and to what extent they influence the perception of incongruent audio-visual logatoms. In a first experiment, the PL rendering is evaluated per se. The results appear to confirm previous studies: though less effective than actual video, PL speech enhances intelligibility and can reproduce the McGurk effect. In the second experiment, the movements were estimated with our tracking framework using various appearance models. No salient differences emerge between the performances of the appearance models.
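The abstract does not spell out how the stimuli were prepared, but both steps it names are standard operations. The Python sketch below is illustrative only: it mixes white Gaussian noise into a clean signal at a prescribed SNR and renders one point-light frame from tracked 2D point positions. The function names, the stand-in signal, and the SNR levels are assumptions, not details taken from the paper.

```python
import numpy as np
import matplotlib.pyplot as plt

def degrade_with_white_noise(speech, snr_db, rng=None):
    """Add white Gaussian noise so the mixture reaches the requested SNR (dB)."""
    rng = np.random.default_rng(0) if rng is None else rng
    p_signal = np.mean(speech ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    p_noise = p_signal / 10 ** (snr_db / 10)
    return speech + rng.normal(0.0, np.sqrt(p_noise), speech.shape)

def render_point_light_frame(points_2d, out_path):
    """Render one PL frame: tracked facial points as white dots on black."""
    fig, ax = plt.subplots(figsize=(4, 4), facecolor="black")
    ax.set_facecolor("black")
    ax.scatter(points_2d[:, 0], points_2d[:, 1], s=20, c="white")
    ax.invert_yaxis()   # image coordinates: y grows downward
    ax.axis("off")
    fig.savefig(out_path, dpi=100, facecolor="black")
    plt.close(fig)

# Hypothetical usage: the exact SNR levels are not given in the abstract.
clean_logatom = np.sin(2 * np.pi * 220 * np.linspace(0, 1, 16000))  # stand-in signal
stimuli = {snr: degrade_with_white_noise(clean_logatom, snr) for snr in (12, 6, 0, -6)}
```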
Similar resources
Vision of tongue movements bias auditory speech perception.
Audiovisual speech perception is likely based on the association of auditory and visual information into stable audiovisual maps. Conflicting audiovisual inputs generate perceptual illusions such as the McGurk effect. Audiovisual mismatch effects could be driven either by the detection of violations of standard audiovisual statistics or by the sensorimotor reconstruction of the distal...
Audiovisual cues benefit recognition of accented speech in noise but not perceptual adaptation
Perceptual adaptation allows humans to recognize different varieties of accented speech. We investigated whether perceptual adaptation to accented speech is facilitated if listeners can see a speaker's facial and mouth movements. In Study 1, participants listened to sentences in a novel accent and underwent a period of training with audiovisual or audio-only speech cues, presented in quiet or i...
Interaction of visual cues for prominence
The timing of both eyebrow and head movements of a talking face was varied systematically in a test sentence using an audiovisual speech synthesizer. The audio speech signal was unchanged over all sentences. 33 listeners were given the task of identifying the most prominent word in the test sentence. Results indicate that both eyebrow and head movements are powerful visual cues for prominence a...
Towards a lexical fuzzy logical model of perception: the time-course of audiovisual speech processing in word identification
This study investigates the time-course of information processing in both the visual and the auditory speech signals used for word identification in face-to-face communication. It extends the limited previous research on this topic and provides a valuable database for future research in audiovisual speech perception. An evaluation of models of speech perception by ear and eye in their ability ...
Teaching and learning guide for audiovisual speech perception: A new approach and implications for clinical populations
When a speaker talks, the visible consequences of what they are saying can be seen. This auditory (the speech sound) and visual (movements of the lips and other articulators) information, or AV speech, influences what listeners hear both in noisy listening environments and when auditory speech can easily be heard. Thought to be a cross-cultural phenomenon that emerges early in typical language development, ...
Publication date: 2004